ABSTRACT


Some  procedures  for  discriminating  between  earthquakes  (Q)  and  explosions  (X)  set  aside  a 
region  of  the  discrimination  parameter  space  (x)  in  which  no  decision  is  made;  this  may  be  called 
the  “unidentified”  or  “undecided”  region,  (U).  The  existing  statistical  literature  seems  not  to 
explicitly  provide  any  such  option,  although  an  undecided  region  may  be  rigorously  supported  by 
the  decision  theory  literature  on  “action  spaces.”  In  this  report  we  show  how  the  concept  of  U 
arises naturally from the concept of the costs of classification: positive costs from misclassifications
and “no decisions”, and negative costs (benefits) from correct classifications. The resulting
approach  is  a  generalization  of  the  “Expected  Cost  of  Misclassification”  (ECM)  approach;  we  call 
the  generalization  the  “Expected  Cost  of  Classification”  (ECC)  approach. 

We  also  show  how  thresholds  for  detecting  X  as  outliers  of  a  Q  population  may  be  derived 
from  cost  considerations  together  with  uniform  distributions  for  X,  and,  together  with  plausible 
prior  probabilities  for  Q  and  X,  lead  to  reasonable  thresholds  in  realistic  scenarios. 

Existing  procedures  also  are  sometimes  one-sided,  theoretically  giving  decisions  only  if  the 
event  is  actually  Q.  For  example,  if  an  event  is  deep,  it  is  Q;  but  if  shallow,  as  is  required  for  X 
and  is  possible  for  Q,  there  is  no  decision.  We  show  how  a  similar  concept  can  emerge  naturally 
from  the  concept  of  different  population  variances  for  Q  and  X;  the  linear  discriminant  no  longer 
applies,  and  the  problem  lies  in  the  domain  of  quadratic  discriminants  such  that  there  may  be 
unconnected  Q  regions,  X  regions,  and  U  regions.  One  also  finds  that  one  may  have  Q  and  U 
regions,  but  no  X  region. 

In  the  fundamental  approach  to  classification  taken  in  this  report,  inversions  of  the  estimated 
population covariance matrices are not required, whereas they are in classical linear discrimination.
The more direct ECC approach simply involves estimation of the population descriptive
parameters,  such  as  means  and  variances,  followed  by  direct  determination,  at  each  point  of  the 
decision  space,  of  that  population  which  has  the  lowest  point  ECC.  The  ECC  is  a  function  of  the 
population means and variances, misclassification costs, “no decision” costs, the (possibly negative)
costs (benefits) of correct classification, and the prior probabilities of Q and X.

Application of the method to a set of plausible International Monitoring Community parameters
results in an 80-90% probability of detection, one false alarm per year, and a total cost 30-40%
greater than the total benefits. For a system with better discrimination power, benefits would
more nearly equal costs.


INTRODUCTION 


All discrimination studies of which I am aware in the seismic literature, e.g., Taylor et al. (1989),
rely on linear discrimination. Linear discrimination assumes that the variance matrices of the discriminants
are equal for Q and X, and that the costs for correct classification are zero. In addition,
most studies implicitly assume that the prior probabilities of Q and X are equal and that the misclassification
costs, c(Q|X) and c(X|Q), are equal. For the theory of discrimination used in this memorandum,
see Chapter 11 of Johnson and Wichern (1998).

However, some discriminants as implemented in this way may, in actual application, have a high
error rate, e.g., 10% of Q may be classified as X, or 10% of X may be classified as Q. In addition
to political costs, such events may result in costs for extensive additional technical work when it
later  becomes  clear  that  a  mistake  was  made.  Thus,  it  is  plausible  that  it  would  be  useful  to  classify 
an  event  as  “not  discriminated”  or  “unidentified”  if  there  were  not  high  confidence  that  the  event  is  Q 
or  X,  and  if  the  costs  of  a  failure  to  identify  were  less  than  that  of  an  incorrect  identification. 

Of  course,  there  is  some  cost,  c(U|X)  or  c(U|Q),  for  each  “no  decision”  in  that  further  work,  using 
other  methods,  must  immediately  be  put  into  many  such  events.  (In  this  report  we  use  the  notations 
c(A|B) and “cab” interchangeably.) There would, of course, be less work put into a small, unidentified
event in an uninteresting region than into a large unidentified event near a test site. We shall see
that  these  differences  can  result  in  different  decision  and  “no  decision”  thresholds  as  a  function  of 
event  magnitude  if  the  objective  is  to  minimize  cost. 

In  addition,  there  may  be  negative  costs  or  “benefits”,  c(X|X)  and  c(Q|Q).  It  seems  plausible  that 
there  would  usually  be  a  much  higher  benefit  (more  negative  cost)  for  correctly  identifying  X  than  for 
correctly  identifying  Q. 

Also,  the  costs  of  the  overall  discrimination  process  must  be  allocated  in  some  manner,  and  it  is 
plausible  to  do  so  by  allocating  a  small  cost  to  every  event  analyzed.  The  dominant  positive  costs, 
summed  over  all  events  will,  in  most  cases,  be  c(Q|Q)  and  c(U|Q)  because  of  the  large  number  of  Q 
events.  In  a  “worthwhile”  system  one  would  expect  the  benefits,  e.g.,  c(X|X)  integrated  over  all  X,  to 
approximately  cancel  the  total  positive  costs. 

It is important to note that proof of the classical result that classification regions are determined
by the maximum prior-weighted probability, an intuitively appealing result (e.g., see
Johnson and Wichern, 1998), depends not only on the assumption of equal misclassification costs
and on zero values for c(Q|Q) and c(X|X), but also on the assumption that no other actions, such
as U, are possible. If any of these assumptions is unsuitable for the case of interest, then many
classical discrimination approaches used in the literature will not be optimal with respect to cost.

If  policy  experts  feel  that  the  expected  total  political  cost  of  a  system’s  poor  performance  is 
too  high,  they  can  provide  greater  system  resources,  which  should  improve  discrimination  and 
may  lower  overall  costs  since,  although  the  system  costs  will  likely  increase,  the  political  costs 
may  decrease  even  more. 

The  initial  analysis  cost  for  each  event,  which  is  the  cost  of  the  overall  discrimination  system 
operation as allocated equally to each event, can be seen in the mathematics not to affect the discrimination
thresholds for a fixed, particular system. From a larger point of view, however,
increased system costs should result in a different and better system with respect to discrimination,
e.g., there may be larger differences between the means or smaller population variances.
Thus,  the  thresholds  may  change  as  system  costs  increase  because  the  system  itself  changes. 

The  equally  allocated  cost  does  affect  the  total  cost  and,  hence,  can  be  an  element  to  consider 
in  choosing  between  different  systems. 

A  number  of  studies  of  regional  P/S  discriminants  show  considerable  overlap,  especially  at 
frequencies  below  5  Hertz  (Hz).  However,  it  is  often  the  case  that  a  few  Q  may  be  identified  by 
their especially low values. (See, e.g., Figure 7 in Taylor et al., 1989, who remark that the Q population
shows substantially larger “scatter” than the X population.) This is reasonable on various
physical grounds, and may be modeled by assuming that the Q variance is higher than the X variance.
We shall see that this implies that linear discrimination does not apply, that some Q may be
identified,  but  not  X,  and  that  there  is  a  large  unidentified  (U)  region  of  discrimination  space. 

Although  we  discuss  only  a  single  discrimination  variable  in  this  memorandum,  it  can  be  the 
case  that  multiple  variables  can  be  combined  into  a  single  variable.  For  example,  Fisk  et  al. 
(1995), Bottone et al. (1996), and Murphy et al. (1997) have shown how Ms, mb, and the standard
errors  of  both  variables  can  be  combined  into  a  single  variable.  This  variable  could  be  treated  by 
the  techniques  of  this  report. 

Also,  we  may  note  that  there  is  nothing  intrinsically  one-dimensional  to  the  approach  in  this 
report;  the  probability  densities,  not  necessarily  normal,  may  be  evaluated  in  two,  three  or  more 
dimensions,  and  the  same  basic  techniques  utilized. 

CASES  OF  INTEREST 

Figure  1  shows  the  population  probability  distributions  for  several  cases  analyzed  in  this  report. 
Figure 1a illustrates the case of a rather poor discriminant in which both variances are equal. The
discriminant  is  poor  because  the  standard  errors  are  not  small  compared  to  the  difference  between 
the  means. 

The  figure  shows  two  normal  distributions,  p(x),  with  means  of  -1  and  1  and  a  common  variance 
of  1,  representing  earthquakes  (Q)  and  explosions  (X).  To  help  fix  ideas,  the  horizontal  axis,  x,  may 
be  thought  of  as  a  log(P/S)  ratio  which  is  smaller  for  Q  than  for  X. 

The standard linear discriminant, assuming equal prior probability, p, and equal costs of misclassification,
c, would classify all events for which x > 0 as X and for x < 0 as Q. The false alarm
rate for X, P(X|Q), would be one minus the cumulative normal, 1 - phi(mu, sigma, x) = 1 - phi(-1,1,0) = 0.16; and the
probability of detection for X, P(X|X), would be 1 - phi(1,1,0) = 0.84.
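As a quick numerical check of these two rates, the following minimal sketch evaluates the cumulative normal with Python's standard-library `statistics.NormalDist`. The variable names are illustrative assumptions and do not come from the report.

```python
from statistics import NormalDist

# Figure 1a populations: equal variance, means at -1 and +1.
q_pop = NormalDist(mu=-1.0, sigma=1.0)   # earthquakes (Q)
x_pop = NormalDist(mu=1.0, sigma=1.0)    # explosions (X)
threshold = 0.0                          # linear discriminant: decide X if x > 0

false_alarm = 1.0 - q_pop.cdf(threshold)   # P(X|Q): a Q event falls above the threshold
detection = 1.0 - x_pop.cdf(threshold)     # P(X|X): an X event falls above the threshold

print(round(false_alarm, 2), round(detection, 2))
```

Under these assumptions the two printed values should round to the 0.16 and 0.84 quoted in the text.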

Figure 1b illustrates the case of a good discriminant in which both variances are equal. The discriminant
is good because the standard errors, 0.5, are small compared to the difference between the
means, 2.0. In this case, the false alarm rate would be 0.02 and the probability of detection 0.98.

Figure 1c illustrates the case in which the variances are unequal: the standard errors are 1.0 for Q
and 0.5 for X. In this case, the standard error of Q is twice that of X and the linear discriminant
cannot be applied. We shall analyze this situation further below.
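One way to see why a single linear threshold no longer suffices is to note that, for normal densities with unequal standard errors, the log density ratio is quadratic in x and can therefore cross zero twice. The sketch below is only an illustration under the Figure 1c parameters; the coefficient names a, b, c are assumptions introduced here.

```python
from math import log, sqrt

mu_q, sig_q = -1.0, 1.0   # earthquakes (Q)
mu_x, sig_x = 1.0, 0.5    # explosions (X)

# ln f_X(x) - ln f_Q(x) = a*x**2 + b*x + c for two normal densities.
a = 0.5 / sig_q**2 - 0.5 / sig_x**2
b = mu_x / sig_x**2 - mu_q / sig_q**2
c = 0.5 * mu_q**2 / sig_q**2 - 0.5 * mu_x**2 / sig_x**2 + log(sig_q / sig_x)

# Roots of the quadratic: the two points where the densities are equal.
disc = sqrt(b * b - 4.0 * a * c)
lo, hi = sorted([(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)])
print(round(lo, 2), round(hi, 2))
```

Between the two roots the X density exceeds the Q density; outside them, on both sides, Q dominates, so the Q region is not connected. With equal priors, equal misclassification costs, and no undecided option, these crossings would be the decision boundaries.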

Figure 1d illustrates the case where there are three populations. To fix ideas we may imagine
that the middle population represents mining explosions (M) which may have discrimination values
generally intermediate between those of Q and X. Here we have chosen to show all three populations
with small standard deviations, 0.5. However, it is clear that the M population of events complicates
the problem of identifying explosions. In addition, an entire new family of costs must be
specified, i.e., c(X|M), c(M|X), c(U|M), c(M|M), c(M|Q), c(Q|M). Presumably these last three or
four costs would be small, compared to the first two.

In  each  of  the  cases  in  Figure  1,  the  X  population  could  be  replaced  by  a  uniform  distribution 
over  some  range.  This  might  reflect  a  case  where  there  have  been  no  known  nuclear  explosions  in  a 
region  of  interest  and  so  we  imagine  that  we  have  little  idea  what  values  of  x  would  be  appropriate 
for  an  explosion,  and  so  assume  a  uniform  distribution  over  a  large  region  of  x.  We  shall  show  that 
in  that  case,  analysis  produces  results  very  analogous  to  those  obtained  by  outlier  analysis  (Taylor 
and  Hartse,  1997;  Fisk  et  al.,  1996)  in  which  only  the  false  alarm  rate  is  controlled.  Then,  going 
further,  we  can  show  explicitly  what  the  true  costs  and  probabilities  of  detection  appropriate  for  that 
false  alarm  rate  would  be  if  the  true  X  distribution  over  x  was  some  specific  function. 

In  subsequent  sections  we  shall  take  up  these  various  cases  of  interest  and  examine  how  the 
thresholds  and  undecided  regions  change  as  the  costs  and  priors  are  varied.  We  will  also  examine  a 
set  of  costs  and  priors  which  seem  plausibly  realistic  and  observe  how,  as  a  result,  thresholds  could 
plausibly  change  as  a  function  of  magnitude,  and  how  they  might  change  near  test  sites  and  mines 
where  priors  change  from  those  of  a  worldwide  average. 

But  first  it  is  necessary  to  outline  the  theory  for  the  Expected  Cost  of  Classification  (ECC) 
method. 

THEORY:  EXPECTED  COST  OF  CLASSIFICATION  (ECC) 

In order to motivate the theory, let us return to the case discussed in Figure 1a. A user, applying
this  discriminant,  would  likely  find  the  costs  of  16%  false  alarms  unacceptable  and,  if  he  applied  the 
discriminant  at  all,  would  actually  decide  that  an  event  was  a  Q  or  X  only  if  it  had  a  small  or  large 
value,  respectively,  of  x.  Thus,  implicitly,  the  analyst  would  be  implementing  an  undecided  region. 

One formal approach to this problem would be to first find a value of x for which P(X|Q) is an
acceptable value, e.g., 0.01. This value for x is x=1.33; i.e., 1-phi(-1,1,1.33)=0.01, where phi is the
cumulative normal distribution. Thus, an explosion is declared only if x>1.33. Then we find the
value of x for which P(Q|X) is < 0.01, i.e., -1.33. The interval (-1.33, +1.33) would comprise an
unidentified (U) region. P(X|X) under this procedure is 1-phi(1,1,1.33)=0.37. The identical numbers
apply to Q.
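The fixed-error-rate procedure just described can be sketched with the inverse cumulative normal. This is a minimal illustration, assuming the Figure 1a populations and a 0.01 target for both error rates; the variable names are hypothetical.

```python
from statistics import NormalDist

q_pop = NormalDist(mu=-1.0, sigma=1.0)   # earthquakes (Q)
x_pop = NormalDist(mu=1.0, sigma=1.0)    # explosions (X)
target = 0.01                            # acceptable P(X|Q) and P(Q|X)

upper = q_pop.inv_cdf(1.0 - target)   # declare X only above this value
lower = x_pop.inv_cdf(target)         # declare Q only below this value

identified_x = 1.0 - x_pop.cdf(upper)   # P(X|X): fraction of explosions identified
identified_q = q_pop.cdf(lower)         # P(Q|Q): fraction of earthquakes identified

print(round(lower, 2), round(upper, 2), round(identified_x, 2))
```

The interval between `lower` and `upper` plays the role of the unidentified region; by symmetry `identified_q` equals `identified_x`.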

So, only 0.37 of events are identified, instead of the 0.84 with the standard linear discriminant,
but the false alarm rate, f, has been reduced to 0.01 from 0.16. The obvious question is: what are
the relative costs of false alarms and unidentified events?

Johnson and Wichern (1998) give the fundamental criterion for discrimination of multiple
populations from which special cases may be derived for equal covariance, equal prior probability,
equal misclassification costs, zero benefits for correct classification, and only two populations.
(The case usually treated makes all these assumptions.)

Theory  (e.g.,  see  Ferguson,  1967)  shows  that  classification  may  be  generalized  through  the 
concept  of  “action  spaces”.  In  this  generalization,  the  decisions  to  be  made  are  not  explicitly  the 
identity  of  the  true  event  type  given  x,  but  instead  the  action  to  be  taken,  given  x. 

With  this  generalization,  the  fundamental  result  is  that  the  optimal  classification  procedure 
amounts  to  choosing  mutually  exclusive  and  exhaustive  action/classification  regions,  Rj,  such  that 
the  prior-probability-weighted  expected  cost  of  classification  of  all  populations  is  minimized. 

It may be proved that this may be done by allocating the event generating x to the kth action
for which the point cost, C_k,

    C_k(x) = \sum_i p_i f_i(x) c(k|i)                                        (1)

is smallest. The range (not necessarily contiguous) of x over which the kth expression is the minimum
is R_k. Similarly, a particular x may be determined to be in that R_k for which the kth expression
is the minimum over all k expressions. (Since c(i|i) is often assumed to be zero, (1) is often
expressed as a sum for which i is not equal to k. We shall discuss this aspect further below.)

Here f_i is the probability density distribution of population i; c(k|i) is the cost if the event generating x is
in population i and results in action k; and p_i is the prior probability of an event being in population i.

In  (1),  i  represents  true  event  type,  e.g., 

explosion  X 

earthquake  Q 

mine  blast  M 

and  k  represents  a  possible  action,  e.g., 

decide  X 

decide  Q 

decide  M 

decide  U  (unidentified) 

To further fix thoughts let us consider again Figure 1a in which we see two normal distributions
with equal variance of 1.0 and means of -1.0 for Q and +1.0 for X. Thus, we have Q and X true event
types;  let  us  assume  that  the  actions  are  “decide  Q”,  “decide  X”,  and  U,  and  that  we  set  the  costs  as 
c(Q|X)=c(X|Q)=  0.5,  and  c(U|X)=c(U|Q)=0.1.  It  is  important  to  note  that  we  have  set  the  costs  for  U  to 
be  less  than  the  costs  for  misidentification.  In  this  case  we  also  set  c(Q|Q)=c(X|X)=0. 

Note that, in view of c(Q|Q)=c(X|X)=0, the only density function in Cq is fx, and in Cx, fq.

To the left of x=0 in Figure 1a, the smallest point cost will be Cq, since there fx(x) is small and Cq
contains no fq(x), which would be large. Therefore, to the far left we have Rq, as is appropriate since
the Q population dominates for negative values of x. Similarly, to the far right we have Rx. Symmetrically
in-between we would have Ru, if it exists. (There is no Ru if the “no decision” costs equal the
misidentification costs, because in that case Cu is the sum of Cq and Cx, which are both positive in this case, so that
there is no x for which Cu is the minimum.)

Note  that,  in  this  analysis,  no  use  has  been  made  of  the  assumption  that  the  distributions  are  normal, 
have  equal  variances,  etc.,  and,  in  fact,  the  analysis  is  completely  general  for  any  distribution.  We  shall 
see  that  more  complex  sets  of  distributions  simply  result  in  more  complex  decision  regions. 
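The decision rule of equation (1) can be sketched directly: evaluate the point cost of every action at x and pick the smallest. The code below is a minimal illustration, not the report's implementation; the function names, the dictionary layout, and the use of `statistics.NormalDist` are assumptions, and any object with a `pdf` method could stand in for the densities.

```python
from statistics import NormalDist

def point_costs(x, priors, densities, costs):
    """Return {action k: C_k(x)} with C_k(x) = sum_i p_i * f_i(x) * c(k|i)."""
    return {
        action: sum(priors[i] * densities[i].pdf(x) * c_ki for i, c_ki in row.items())
        for action, row in costs.items()
    }

def decide(x, priors, densities, costs):
    """Return the action whose point cost is smallest at x."""
    pc = point_costs(x, priors, densities, costs)
    return min(pc, key=pc.get)

# Figure 1a populations with the costs chosen in the text.
priors = {"Q": 0.5, "X": 0.5}
densities = {"Q": NormalDist(-1.0, 1.0), "X": NormalDist(1.0, 1.0)}
costs = {                       # costs[k][i] = c(k|i): action k, true type i
    "Q": {"Q": 0.0, "X": 0.5},
    "X": {"Q": 0.5, "X": 0.0},
    "U": {"Q": 0.1, "X": 0.1},
}

for x in (-2.0, 0.0, 2.0):
    print(x, decide(x, priors, densities, costs))
```

Because nothing in `decide` depends on the form of the densities, adding a third population (such as M) or a further action only means adding entries to the dictionaries.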

EQUAL VARIANCE FOR Q AND X


Using the decision cost parameters, 0.5 and 0.1, discussed above, together with equal priors,
px=pq=0.5, we may plot the point cost expressions (1) as seen in Figure 2a. We find that Cq=Cu
at x=-0.70. By symmetry the undecided region is between -0.70 and +0.70.
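For equal-variance normal populations the Q/U threshold can also be written in closed form. The following derivation is a sketch that follows from equation (1) with c(Q|Q)=0; the symbols r and x_{QU} are introduced here for convenience and are not the report's notation.

```latex
\begin{align*}
% Q/U boundary: set C_Q(x) = C_U(x) in equation (1), with c(Q|Q) = 0
p_X f_X(x)\, c(Q|X) &= p_Q f_Q(x)\, c(U|Q) + p_X f_X(x)\, c(U|X) \\
\frac{f_Q(x)}{f_X(x)} &= r \equiv \frac{p_X\,[\,c(Q|X) - c(U|X)\,]}{p_Q\, c(U|Q)} \\
% Equal-variance normals: the log density ratio is linear in x
\ln\frac{f_Q(x)}{f_X(x)} &= -\frac{(\mu_X - \mu_Q)\,(2x - \mu_X - \mu_Q)}{2\sigma^2} \\
x_{QU} &= \frac{\mu_Q + \mu_X}{2} - \frac{\sigma^2}{\mu_X - \mu_Q}\,\ln r \\
% Figure 2a values: r = (0.5)(0.4) / ((0.5)(0.1)) = 4
x_{QU} &= 0 - \tfrac{1}{2}\ln 4 \approx -0.69
\end{align*}
```

The value agrees with the plotted threshold of -0.70 to within the resolution of the figure.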

The total expected cost for one event (note px+pq=1.0) is c=0.056, calculated as the area
beneath the lowest point cost curves. It may be explicitly calculated that this cost is less than the
cost would be if there were no undecided region or, indeed, for any other smaller or larger undecided
region. Thus, as the theory guarantees, we have found the minimum cost set of regions.
The false alarm rate (f) is 0.044 and the probability of detection (d) is 0.62 (compare to 0.16 and
0.84 for the linear discriminant).
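The area under the lowest point cost curves can be checked numerically from the cumulative normal. The sketch below assumes the Figure 2a parameters and places the thresholds where Cq=Cu, which for these populations is where the density ratio fq/fx = exp(-2x) equals 4; the variable names are illustrative.

```python
from math import log
from statistics import NormalDist

q_pop = NormalDist(-1.0, 1.0)   # earthquakes (Q)
x_pop = NormalDist(1.0, 1.0)    # explosions (X)
p_q = p_x = 0.5
c_mis, c_und = 0.5, 0.1         # misclassification and "no decision" costs

t = 0.5 * log(4.0)              # undecided region is (-t, +t)

# Expected cost contributed by each decision region (correct decisions cost 0).
cost_q = p_x * c_mis * x_pop.cdf(-t)                      # X events decided as Q
cost_x = p_q * c_mis * (1.0 - q_pop.cdf(t))               # Q events decided as X
cost_u = c_und * (p_q * (q_pop.cdf(t) - q_pop.cdf(-t))
                  + p_x * (x_pop.cdf(t) - x_pop.cdf(-t)))  # events left undecided
total = cost_q + cost_x + cost_u

f = 1.0 - q_pop.cdf(t)          # false alarm rate for a single Q
d = 1.0 - x_pop.cdf(t)          # probability of detection for a single X

# Cost with no undecided region (single threshold at x = 0), for comparison.
no_u = p_x * c_mis * x_pop.cdf(0.0) + p_q * c_mis * (1.0 - q_pop.cdf(0.0))

print(round(total, 3), round(f, 3), round(d, 2), round(no_u, 3))
```

The results should fall close to the values quoted in the text, with small differences attributable to rounding of the thresholds.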

The f and d values for the minimum cost solution are the probability that a single member of Q
will be a false alarm, and that a single member of X will be detected, given the calculated thresholds.
They are not weighted by priors, and cost coefficients have no influence, given the calculated
thresholds, which were, of course, calculated using cost information. With minimum
expected classification cost as the optimality criterion, values of f and d are no longer directly
determinative optimality measures, although they are, of course, of interest and should have “reasonable”
values.

If  we  consider  the  case  where  c(U|X)=c(U|Q)=0.5,  the  same  as  c(Q|X)  and  c(X|Q),  then  there 
is  no  undecided  region  and  the  cost  is  calculated  to  be  0.08.  As  expected,  this  cost  is  greater  than 
with  the  optimally  determined  undecided  region. 

The two cases discussed above are tabulated in rows 1 and 2 of column 1 of Table 1.

It  is  clear  that  if  the  cost  of  an  unidentified  event  is  small,  then  minimization  of  total  cost  will 
result  in  a  large  undecided  region.  As  an  example  of  this  behavior  we  can  reduce  c(U|Q)=c(U|X) 
to 0.02 and find that the undecided region is now [-1.6, +1.6], and that the total cost, c, has been
reduced to 0.017 from 0.056. Of course, while f is now only 0.0046, d is reduced to only 0.27
instead of being 0.62 or 0.84.

In the next section we introduce the concept of a negative cost, or benefit, c(X|X)=-0.5, and
examine  how  this  changes  the  respective  thresholds.  The  results  of  the  calculations  are  given  in 
Figure  2b  and  Table  1  (row  3,  column  1). 


Table 1: Effects of Negative Costs (Benefits), and Unequal Priors and Variances

For each case: muq=-1.0; mux=+1.0; cqx=cxq=0.5; cqq=0.0

Fig.  Parameters                sigq=1.0, sigx=1.0                      sigq=.5, sigx=.5                        sigq=1.0, sigx=.5
2, 3                            (Figure 2)                                                                      (Figure 3)

      cuq=.5, cux=.5, cxx=0,    Q:0.0:X                                 Q:0.0:X                                 Q:0.18:X:3.18:Q
      px=.5, pq=.5              C=.080, f=.16, d=.84                    C=.011, f=.02, d=.98                    C=.042, f=.12, d=.95

a     cuq=.1, cux=.1, cxx=0,    Q:-0.7:U:+.7:X                          Q:-.16:U:.18:X                          Q:-.1:U:.52:X:2.82:U:3.46:Q
      px=.5, pq=.5              C=.056, f=.044, d=.62                   C=[illegible], f=[illegible], d=.95     C=.033, f=.064, d=.83

b     cuq=.1, cux=.1, cxx=-.5,  Q:-.68:U:-.2:X                          Q:-.16:U:-.04:X                         Q:-.1:U:.1:X:3.26:U:3.46:Q
      px=.5, pq=.5              C=-.14, f=.21, d=[illegible]            C=-.23, f=.027, d=.98                   C=-.20, f=.14, d=.96

c     cuq=.1, cux=.1, cxx=0,    Q:.46:U:1.86:X                          Q:.12:U:.48:X                           Q:.4:U:2.96:Q
      px=0.09, pq=0.91          C=.025, f=.002, d=.19                   C=.0045, f=.0015, d=.85                 C=.02, f=.08, d=0

d     cuq=.1, cux=.1, cxx=-.5,  Q:.46:U:.96:X                           Q:.12:U:.24:X                           Q:.4:U:.68:X:2.66:U:2.96:Q
      px=0.09, pq=0.91          C=.0073, f=.025, d=.52                  C=-.037, f=.0065, d=.94                 C=-.0028, f=.046, d=.74

Table Notes: As examples, Q:-.7:U:.7:X indicates that Q is decided for x < -.7, U is decided for
-.7 < x < .7, and X is decided for x > +.7; cxq is the classification cost for X given Q; pq is the
prior probability for Q; C is cost; f is the false alarm rate due to a single member of Q; d is the
probability of detection of a single member of X. All three variance columns in a row share that
row's values of cuq, cux, cxx, px, and pq.


It is interesting to note, in Figure 2b, that for some ranges of x some point costs from equations
(2) are negative due to the existence of negative conditional costs (benefits). The smallest
prior-weighted point cost, negative values included, determines the action or decision at each
point in parameter space.

The  thresholds,  as  seen  in  Figure  2b,  are  now  asymmetrical,  [-.68, -.2],  expanding  Rx,  instead 
of  [-.7, +.7]  due  to  the  fact  that  there  is  a  benefit  for  correctly  identifying  X,  but  not  for  correctly 
identifying  Q,  (c(Q|Q)=0).  There  is  still  an  undecided  region,  U,  for  the  minimum  cost  solution. 

In the above calculation we have assumed equal cost for classifying Q and X as U; i.e.,
cuq=cux=0.1. In general, we would expect a greater cost for failure to identify X than for failure
to identify Q. Repeating the calculation for Figure 2b, but with cux=0.25 and cuq=0.05, we find
only a modest shift in the thresholds: Q:-.8:U:-.24:X instead of Q:-.68:U:-.2:X. This may be
interpreted  as  follows:  as  the  cost  for  failure  to  identify  X  becomes  greater,  the  integral  of  the 
explosion  density  over  the  unidentified  region  is  reduced;  and  as  the  cost  for  failure  to  identify  Q 
becomes  less,  the  integral  of  the  earthquake  density  over  the  unidentified  region  is  increased. 
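The shift in thresholds can be reproduced approximately in closed form for equal-variance normal populations, by equating adjacent point costs from equation (1). The sketch below is an illustration under that assumption; the function name and the "AB" key convention (meaning c(A|B)) are introduced here and are not the report's.

```python
from math import log

def thresholds(c, p_q=0.5, p_x=0.5, mu_q=-1.0, mu_x=1.0, sigma=1.0):
    """Return (x_QU, x_UX) for equal-variance normal Q and X populations.

    c maps "AB" to c(A|B), the cost of action A when the true type is B.
    Meaningful only when x_QU < x_UX, i.e. when an undecided region exists.
    """
    # Density ratios f_Q/f_X at which two adjacent point costs are equal.
    r_qu = p_x * (c["QX"] - c["UX"]) / (p_q * (c["UQ"] - c["QQ"]))
    r_ux = p_x * (c["UX"] - c["XX"]) / (p_q * (c["XQ"] - c["UQ"]))
    mid = 0.5 * (mu_q + mu_x)
    scale = sigma ** 2 / (mu_x - mu_q)
    return mid - scale * log(r_qu), mid - scale * log(r_ux)

base = {"QQ": 0.0, "XX": -0.5, "QX": 0.5, "XQ": 0.5}
equal_u = thresholds({**base, "UQ": 0.10, "UX": 0.10})
unequal_u = thresholds({**base, "UQ": 0.05, "UX": 0.25})

print([round(t, 2) for t in equal_u])     # compare with Q:-.68:U:-.2:X
print([round(t, 2) for t in unequal_u])   # compare with Q:-.8:U:-.24:X
```

Small differences from the quoted thresholds are to be expected if the original values were read from a discretely sampled plot.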

In  the  above  calculation,  we  have  retained  the  assumption  of  equal  prior  probability,  while 
making  the  costs  for  correct  identification  of  Q  and  X  asymmetrical.  Now  let  us  investigate  the 
case  where  the  costs  of  correct  identification  are  restored  to  zero,  but  the  prior  probability  of  Q  is 
10  times  that  of  X. 

Again, the least-cost thresholds are seen, in Figure 2c and row 4 of Table 1, to become asymmetrical,
[0.46, 1.86], in this case, as would be intuitively expected, greatly expanding the range
of Rq, shifting the undecided region toward the explosion population, and reducing the range of
Rx.


While the cost is minimized for Figure 2c, and the false alarm probability is low, the probability
of detection is poor, 0.19. This is an unsatisfying result; results of this sort have very likely led
many  researchers  and  policy  experts  to  discount  the  use  of  priors.  It  seems  counterintuitive  to 
make  it  difficult  to  identify  an  explosion,  just  because  there  are  many  more  earthquakes. 

The  reason  that  such  analyses  give  such  unsatisfactory  results  is  that  they  do  not  take  into 
account  the  cost  benefits  of  correctly  identifying  explosions,  especially  as  compared  to  the  cost 
benefits  of  correctly  identifying  earthquakes.  As  we  saw  above,  such  cost  considerations  shift  the 
threshold  toward  identifying  more  explosions.  Only  a  proper  balance  of  costs  and  priors  will 
yield  a  reasonable  result. 

It is likely such weaknesses in existing theory have led many users to rely on their
intuition in setting f and d thresholds directly, instead of performing a more fundamental analysis.

In Figure 2d, we combine the two effects of a higher prior for Q and higher benefit for identifying
X, and find the thresholds at [0.46, 0.96]. We see that the range of x over which explosions
are  identified  has  increased  substantially;  all  X  with  x  greater  than  the  explosion  mean  are  now 
identified  as  explosions;  the  probability  of  detection  has  increased  to  0.52. 
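The regions for Figures 2c and 2d can also be found without any closed form, by scanning x on a grid and recording where the least-cost action changes. This is a minimal sketch; the 0.02 grid spacing, the scan range, and the function name are assumptions made for illustration.

```python
from statistics import NormalDist

def regions(priors, dens, costs, lo=-4.0, hi=6.0, step=0.02):
    """Scan x and list (x, action) wherever the least-cost action changes."""
    out, prev = [], None
    n = int(round((hi - lo) / step))
    for i in range(n + 1):
        x = lo + i * step
        # Point cost of each action k: sum over true types t of p_t * f_t(x) * c(k|t).
        pc = {k: sum(priors[t] * dens[t].pdf(x) * c for t, c in row.items())
              for k, row in costs.items()}
        best = min(pc, key=pc.get)
        if best != prev:
            out.append((round(x, 2), best))
            prev = best
    return out

priors = {"Q": 0.91, "X": 0.09}
dens = {"Q": NormalDist(-1.0, 1.0), "X": NormalDist(1.0, 1.0)}

def cost_table(cxx):
    # costs[k][t] = c(k|t); only the benefit c(X|X) differs between the two figures.
    return {"Q": {"Q": 0.0, "X": 0.5},
            "X": {"Q": 0.5, "X": cxx},
            "U": {"Q": 0.1, "X": 0.1}}

fig_2c = regions(priors, dens, cost_table(0.0))
fig_2d = regions(priors, dens, cost_table(-0.5))
d_2c = 1.0 - dens["X"].cdf(fig_2c[-1][0])   # detection probability above the U/X threshold
d_2d = 1.0 - dens["X"].cdf(fig_2d[-1][0])
print(fig_2c, round(d_2c, 2))
print(fig_2d, round(d_2d, 2))
```

Each boundary is located only to within one grid step, so the reported thresholds may differ from the exact crossing points by up to 0.02.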

The foregoing analyses, with standard errors of 1.0, are repeated in the corresponding rows of
column  2  of  Table  1  for  standard  errors  of  0.5.  Some  results  of  interest  are  that  the  undecided 
zones  are  smaller,  and  the  costs  are  lower. 

It  is  natural  that  the  costs  are  lower  because  there  are  fewer  errors  made.  However,  this  result 
is somewhat misleading because it is to be expected that to attain a better discrimination capability
(smaller standard error), more funding might have to be provided by, for example, building
more seismic arrays or investing in better processing capability. These costs would have to be
allocated to the overall system and would perhaps best be specified as an equal cost for every analyzed
event.

In  particular,  a  cost  such  as  c(Q|Q)=0.01,  when  allocated  equally  to  each  event  and  then 
summed  over  1000  Q  events  and  one  or  two  X  events,  would  massively  affect  the  total  cost.  This 
cost would not, however, change any thresholds, or f or d values, unless, along with the cost, the
mean and variance parameters changed. Of course, if funds were invested in a system, one would
expect that the discrimination parameters would improve and, in that case, f and d would also be
expected  to  improve,  and  costs,  other  than  system  enhancement  costs,  would  decline. 

If  the  costs  of  false  alarms  were  high  enough,  then  the  sum  of  the  reduced  cost  of  a  reduced 
number of false alarms, plus increased costs of system enhancement to develop better discrimination,
could decline. We will discuss this point in more detail below.

A REALISTIC INTERNATIONAL MONITORING COMMUNITY EXAMPLE

Let  us  consider  the  situation  where  we  are  monitoring  a  large  land  mass  where  approximately 
1000  events  of  interest,  mostly  earthquakes,  occur  per  year.  We  may  imagine  that  the  cost  of  the 
monitoring  system  is  approximately  $10M,  so  that  cqq=$0.01M  processing  cost  per  event.  From 
a policy point of view we assume that there could be one explosion per year in the region of interest.
If we knew that there were no events, we would not be maintaining the monitoring system.
An  alternative  assumption  might  be  one  explosion  every  10  years,  or  0.1  explosions  per  year. 

We imagine that we have a fairly good discriminant capability, one as seen in Figure 1b,
where  the  standard  errors  are  0.5. 

Since we may presume that the policy-maker’s principal interest in maintaining the monitoring
system is to detect explosions, we set the benefit from detecting an explosion at the total
expense of monitoring, i.e., cxx=-$10M. (We may neglect the $0.01M processing cost for the
explosion.)

The  cost,  mostly  political  and  strategic,  for  misidentifying  an  X  as  a  Q,  cqx,  may  be  taken  as 
being  of  the  same  order  as  cxx.  For  this  example,  we  take  it  to  be  either  $2M  or  $10M. 

The costs for identifying Q as X are mostly political and are usually corrected by further analysis;
we choose cxq=$1M.

The  cost  of  an  event  being  unidentified,  U,  is  lower  than  being  misidentified,  because  it  may 
be  assumed  that  further  analysis  will  lead  to  a  correct  decision;  and  because  that  uncertainty  will 
help  to  properly  hedge  political  or  tactical/strategic  decisions.  Failure  to  identify  Q  is  less  serious 
than failure to identify X, and so we take cux=$1M, cuq=$0.2M.

The  foregoing  may  be  summarized  in  Table  2. 

Table 2: Possible Cost ($M) Matrix for Realistic Case

pq=.999, px=.001, mux=1.0, muq=-1.0; sigx=0.5; sigq=0.5

                      Action
                      Decide:X       Decide:Q       U
Event Type    X       cxx=-10        cqx=2, 10      cux=1
              Q       cxq=1          cqq=.01        cuq=.2

The results of these calculations are that for cqx=$2M, the threshold between Q and X is at
x=0.56, approximately 1 standard deviation to the left of the mean of the X population, Q:0.56:X.
For cqx=$10M, the threshold shifts slightly and a U region appears: Q:0.4:U:0.54:X. We see that
the cost data have determined that the threshold be set so that there is approximately an 80-90%
probability of detecting the explosion. For both cases, the probability of a false alarm is
approximately 0.001, corresponding to approximately one false alarm per year.

The expected total cost for 1000 such events at the determined thresholds is $3.1M for
cqx=$2M and $4.3M for cqx=$10M. Thus, we see that the benefits do not fully cancel the costs
for this set of discriminants. A better set of discriminants, e.g., smaller standard errors, would
result in benefits more nearly cancelling costs.
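The realistic case can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that the point cost of an action at x is the sum over event types of prior times density times conditional cost, and that each x is assigned to the action with the smallest point cost. The function names (ecc_regions, boundaries, expected_cost, table2_cost) and the 0.01 grid step are illustrative choices.

```python
import math

def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ecc_regions(cost, prior, dist, lo=-4.0, hi=4.0, step=0.01):
    """Grid search: at each x keep the action with the smallest point cost.

    cost[action][event] is c(action|event), prior[event] is the prior and
    dist[event] is (mean, standard error). Returns [[action, first_x, last_x], ...].
    """
    regions = []
    for i in range(int(round((hi - lo) / step)) + 1):
        x = lo + i * step
        point = {a: sum(prior[e] * norm_pdf(x, *dist[e]) * cost[a][e] for e in prior)
                 for a in cost}
        best = min(point, key=point.get)
        if regions and regions[-1][0] == best:
            regions[-1][2] = x
        else:
            regions.append([best, x, x])
    return regions

def boundaries(regions):
    """Midpoint between the last grid point of one region and the first of the next."""
    return [0.5 * (regions[k][2] + regions[k + 1][1]) for k in range(len(regions) - 1)]

def expected_cost(regions, cost, prior, dist):
    """Expected cost per event: integrate each density over each decision region."""
    edges = [-math.inf] + boundaries(regions) + [math.inf]
    total = 0.0
    for k, region in enumerate(regions):
        for e in prior:
            mass = norm_cdf(edges[k + 1], *dist[e]) - norm_cdf(edges[k], *dist[e])
            total += prior[e] * cost[region[0]][e] * mass
    return total

prior = {"Q": 0.999, "X": 0.001}
dist = {"Q": (-1.0, 0.5), "X": (1.0, 0.5)}

def table2_cost(cqx):
    # Costs in $M, read from Table 2 and the accompanying text.
    return {"X": {"X": -10.0, "Q": 1.0},
            "Q": {"X": cqx, "Q": 0.01},
            "U": {"X": 1.0, "Q": 0.2}}

for cqx in (2.0, 10.0):
    cost = table2_cost(cqx)
    regions = ecc_regions(cost, prior, dist)
    print(cqx, [r[0] for r in regions], [round(b, 2) for b in boundaries(regions)],
          round(1000 * expected_cost(regions, cost, prior, dist), 2))
```

Under these assumptions the sketch should give a single Q:X boundary near 0.55 for cqx=2 and a narrow U region between roughly 0.38 and 0.54 for cqx=10, with costs per 1000 events close to the $3.1M and $4.3M quoted above. Small differences from the reported thresholds are expected from the grid spacing.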

TEST  SITES 

If one is concentrating on a test site, then one often has experience suggesting that the prior
probability of X is substantially greater than that of Q. So, plausibly for such a case, we choose
px=0.909 and pq=0.0909 (a 10:1 ratio). Using parameters similar to those in Table 1, we choose
for a test site: mux=1, muq=-1, sigx=sigq=0.5. This corresponds to good discrimination.

Then  we  choose  cxq=cqx=0.5.  These  correspond  to  a  high  cost  for  misidentification,  as  in 
Table  1. 

Then  we  choose  cuq=cux=0.25.  These  correspond  to  a  fairly  high  cost  for  an  unidentified 
event  as  compared  to  Table  1  because,  at  a  test  site,  unidentified  events  are  more  serious  since  the 
presumption  is  that  an  event  is  a  test. 

As for some cases in Table 1, we choose cxx=-0.5, there being a substantial benefit for
identification of an explosion. However, we do not choose a benefit equal to the total cost of the
monitoring system because the test site represents only a small portion of total monitoring. For a
test site, there is also substantial benefit for correct identification of Q, because it helps to
defuse possible political conflicts; we choose cqq=-0.25.

With these parameters there is a simple Q:X decision point at x=-0.32, and, as it happens, no
U. The decision point is further to the left, toward the Q population, than any similar decision
point in Table 1; to attain minimum cost there is a tendency to lean toward deciding that an event
is X. Because of the benefits of a high probability of a correct identification of X, the cost is less
than for any example in Table 1; the cost is -0.47.
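The decision point at x=-0.32 can be recovered by hand. The worked equation below is a sketch which assumes that the Q:X boundary lies where the point costs of deciding X and deciding Q are equal, and that both populations are normal with a common standard error; the subscripted symbols are shorthand for the parameters cxq, cqq, cqx, cxx, pq, px, mux, muq and sigx=sigq used above.

```latex
\begin{aligned}
p_q f_q(x)\,c_{xq} + p_x f_x(x)\,c_{xx} &= p_q f_q(x)\,c_{qq} + p_x f_x(x)\,c_{qx} \\
\frac{f_x(x)}{f_q(x)} &= \frac{p_q\,(c_{xq}-c_{qq})}{p_x\,(c_{qx}-c_{xx})}
  = \frac{0.0909 \times 0.75}{0.909 \times 1.0} = 0.075 \\
\frac{f_x(x)}{f_q(x)} &= \exp\!\left(\frac{(\mu_x-\mu_q)\,\bigl(x-\tfrac{1}{2}(\mu_x+\mu_q)\bigr)}{\sigma^2}\right) = e^{8x} \\
x &= \tfrac{1}{8}\ln 0.075 \approx -0.32
\end{aligned}
```

At this x the point cost of U is about 0.040 f_q(x), which exceeds the common Q and X point cost of about 0.011 f_q(x), consistent with the absence of a U region.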

For  any  particular  test  site,  it  is  clear  that  parameters  other  than  those  chosen  here  may  be 
more  suitable,  and  the  decision  points  for  minimum  cost  may  be  elsewhere. 

UNEQUAL  VARIANCE  FOR  Q  AND  X 

Column  3  of  Table  1  is  for  the  case  of  unequal  variance  for  the  Q  and  X  populations.  The  sub¬ 
plots,  Figures  3a-d,  correspond  to  Figures  2a-d  except  for  the  difference  in  X  variance;  i.e.,  the 
standard  error  for  X  is  0.5  instead  of  1.0. 

Inspection  of  Figure  3a  and  the  second  row,  third  column  of  Table  1,  shows  that  instead  of 
having  sequential  Q:U:X  regions  as  x  increases,  we  now  have  sequential  Q:U:X:U:Q  regions.  We 
are  in  the  realm  of  the  quadratic  discriminant  where  the  region  resulting  in  identical  decisions  is 
not  necessarily  a  single  connected  space.  Basically,  this  increase  in  number  of  regions  occurs 
because,  for  large  x,  the  larger  value  of  the  standard  error  for  Q  ensures  that  Pq>Px  even  though 
the  mean  of  the  X  population  is  greater  than  the  mean  of  the  Q  population.  As  would  be  expected, 
since  the  smaller  standard  error  for  X  results  in  better  discrimination,  the  cost  is  less  than  the  cost 
when  the  standard  error  was  1.0  for  both  Q  and  X. 
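The appearance of disconnected regions can be illustrated with a short sketch. Table 1 is not reproduced in this part of the report, so the parameter values below are an assumption: they are the values listed in the header of Table 5 (which the text identifies with Row 2 of Table 1), with the standard error of X reduced to 0.5. The function name ecc_regions and the grid are illustrative.

```python
import math

def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def ecc_regions(cost, prior, dist, lo, hi, step=0.01):
    """Assign each grid point to the action with the smallest point cost."""
    regions = []
    for i in range(int(round((hi - lo) / step)) + 1):
        x = lo + i * step
        point = {a: sum(prior[e] * norm_pdf(x, *dist[e]) * cost[a][e] for e in prior)
                 for a in cost}
        best = min(point, key=point.get)
        if regions and regions[-1][0] == best:
            regions[-1][2] = x
        else:
            regions.append([best, x, x])
    return regions

# Assumed parameters: Table 5 header values, but with sigx=0.5 and sigq=1.0.
prior = {"Q": 0.5, "X": 0.5}
dist = {"Q": (-1.0, 1.0), "X": (1.0, 0.5)}
cost = {"X": {"X": 0.0, "Q": 0.5},
        "Q": {"X": 0.5, "Q": 0.0},
        "U": {"X": 0.1, "Q": 0.1}}

regions = ecc_regions(cost, prior, dist, lo=-6.0, hi=8.0)
labels = [r[0] for r in regions]
edges = [0.5 * (regions[k][2] + regions[k + 1][1]) for k in range(len(regions) - 1)]
print(":".join(labels), [round(e, 2) for e in edges])
```

With these assumed values the log-likelihood ratio is quadratic in x, so each threshold on the ratio is crossed twice and the sequence Q:U:X:U:Q results, with the second Q region lying to the right of the X population.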

Again  we  may  note  in  Figure  3b  that,  as  the  benefit  for  detecting  X  is  large,  Rx  increases  in 
size  as  compared  to  Figure  3a. 

And in Figure 3c, where the prior on Q has increased, Rq increases as compared to Figure 3a.
In this case, due to the larger standard error for Q, there is, in fact, no Rx. The only choice is
between Q or U. The probability of detection, d, is 0.0. If the discriminant under consideration
were the only discriminant available, then one would have to examine each U by other means if
one were to have any hope of detecting an X.

Where there is both a large benefit for detecting X and a large prior for Q, we have a result in
Figure 3d intermediate between Figures 3b and 3c, and there exists an Rx.

MULTIPLE POPULATIONS EXAMPLE: Q AND X PLUS MINING (M)

We now address the question of extending the ECC approach to more than two populations.
We shall see that it is actually a simple matter: all that is required is to continue the approach of
evaluating each point cost expression (1) and defining that range of x where the kth is a minimum
as Rk.
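Expression (1) appears earlier in the report and is not reproduced here. The form below is an assumed restatement, inferred from how priors, densities and conditional costs are combined in the examples of this section: j runs over the event populations (Q, X, M), k and l run over the available actions (Q, X, M, U), p_j is the prior, f_j the density of the discriminant, and c(k|j) the cost of action k when the event is of type j.

```latex
C_k(x) = \sum_{j} p_j \, f_j(x) \, c(k \mid j),
\qquad
R_k = \{\, x : C_k(x) \le C_l(x) \ \text{for all actions } l \,\}
```

Adding a population adds one term to each sum, and adding an action adds one more point cost to compare; nothing else in the procedure changes.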

We shall generalize the Q and X populations by adding a mining event population, M, with
mean 0.0, between the Q (-1.0) and X (+1.0) means. As noted previously, it is necessary
to define a number of additional costs. Here, in Table 3, as an illustration we give detailed
plausible estimates of these costs.

It  is  useful,  in  thinking  about  these  costs,  to  categorize  them  into  monitoring  and  political 
costs  and,  within  these  categories,  to  further  subdivide  into  short-term  and  long-term  costs.  The 
total  cost  for  each  conditional  cost  is  the  sum  over  short-  and  long-term  costs,  and  over  monitor¬ 
ing  and  political  costs. 

Consider,  for  example,  the  first  row  in  Table  3  which  gives  values  for  cqx  (c(Q|X)):  the  cost 
for  misclassifying  an  explosion  as  an  earthquake.  Since,  in  operations,  it  may  not  be  known 
immediately  that  there  has  been  a  mistake,  one  may  imagine  that  there  is  no  short-term  cost,  either 
monitoring  or  political.  The  detection  of  an  earthquake  does  not  provoke  any  special  studies  in  a 
monitoring system and the political system is indifferent. Thus, both short-term entries for cqx
are set to 0.0 in Table 3.


Table 3: Decision Costs for Mining/Seismic Region

             Monitoring System          Political System
Cost Type    Short Term    Long Term    Short Term    Long Term    Sum
cqx          +0.0          +0.5         +0.0          +0.5         +1.0
cxq          +0.1          +0.5         +0.1          +0.5         +1.2
cux          +0.1          +0.1         +0.0          +0.2         +0.4
cuq          +0.1          +0.1         +0.0          +0.1         +0.3
cxx          +0.1          +0.0         -0.2          -0.5         -0.6
cqq          +0.01         +0.0         +0.0          +0.0         +0.01
cqm          +0.1          +0.1         +0.0          +0.2         +0.4
cxm          +0.1          +0.25        +0.1          +0.25        +0.7
cum          +0.1          +0.1         +0.0          +0.2         +0.4
cmx          +0.0          +0.5         +0.0          +0.5         +1.0
cmq          +0.1          +0.1         +0.0          +0.2         +0.4
cmm          +0.01         +0.0         +0.0          +0.0         +0.01


On  the  other  hand,  in  the  long-term,  there  may  be  large  political  costs  for  having  missed  the 
detonation  of  a  nuclear  test;  and  once  the  fact  that  an  apparent  earthquake  was  actually  a  nuclear 
test  is  discovered,  the  monitoring  system  is  likely  to  devote  a  substantial  amount  of  resources  to 
studying  the  situation  and  to  implementing  means  to  avoid  a  repetition  of  the  incident. 

Thus,  it  is  reasonable  that  the  long-term  cqx  costs  would  be  nonzero  for  both  monitoring  and 
political  systems;  we  set  them  equal  to  0.5.  We  shall  discuss  in  a  subsequent  section  the  relations 
of  these  cost  numbers  to  plausibly  realistic  dollar  budgets. 

The  second  row,  for  cxq,  is  much  the  same  except  that  there  is  immediate  short-term  work  cre¬ 
ated  for  both  the  monitoring  system  and  the  political  system  when  an  explosion  is  thought  to  have 
been  detected. 

The  values  entered  in  the  third  row,  cux,  reflect  the  idea  that  having  an  explosion  determined 
to  be  unidentified  is  not  as  costly  as  being  identified  as  an  earthquake,  and  also  reflect  the  idea  that 
there  is  some  immediate  work  generated  in  the  monitoring  system  when  an  event  is  unidentified. 

As discussed previously, cuq is plausibly less costly than cux.

The large negative costs (benefits) in the political system for the fifth row of Table 3, cxx,
reflect the idea that correct identification of explosions is the principal goal of the monitoring
system and each such rare event represents a payoff which should be at least comparable to the
total cost of the system.

For  the  sixth  row  of  Table  3,  cqq,  we  see  that  the  political  costs  are  0.0.  There  is  no  political 
interest  in  the  correct  identification  of  earthquakes.  However,  from  the  monitoring  point  of  view, 
the  chief  activity  of  the  monitoring  system  is  the  correct  identification  of  earthquakes  so  that  one 
may  be  sure  that  they  are  not  explosions.  The  total  expense  of  the  system  should  be  allocated  over 
all  the  events  analyzed.  Most  of  these  events  are  earthquakes  and  so  a  small,  short-term  cost, 
0.01,  is  allocated  to  such  events.  (This  cost  may  actually  be  regarded  as  present  as  a  short-term 
monitoring  cost  in  each  of  the  other  categories  so  that,  for  example,  the  total  0.5  monitoring  cost 
for  cqx  should  be  regarded  as  being  made  up  of  0.01  routine  costs  and  0.49  extraordinary  costs. 
The  differences  in  thresholds  resulting  from  a  cost  of  0.49  as  compared  to  0.50  are,  of  course, 
negligible.) 

The small monitoring cost per event comes significantly into play only when (1) there are no
other extraordinary costs, as in cqq and cmm; and (2) at the same time, there are a great number of
such events. This combination is often the case for small earthquakes.

It  is  reasonable  to  assign  substantial  costs  to  cqm,  cum,  and  cmq,  because  confusion  as  to 
whether  an  event  is  a  mine  event  or  not  makes  it  difficult  to  see  clearly  what  is  happening  and  thus 
reduces  confidence  in  the  discrimination  procedures. 

Additional  rationales  for  the  costs  in  Table  3  should  be  apparent  by  analogy  to  the  paragraphs 
above.  More  detailed  discussions  might  only  be  useful  in  the  context  of  an  actual  application. 

With the set of costs from Table 3, Figures 4a-d and Table 4 give results for several cases of
interest involving mining events.

Table 4: Mine Monitoring Scenarios of Interest

For each case: muq=-1.0, mux=+1.0, mum=0, sigx=0.5, sigm=0.5, px=1, pm=10

Fig   Parameters        Decision regions                      Results
4a    sigq=.5, pq=1     Q:-1.06:M:.88:X                       c=.71, fq=.0001, fm=.039, d=.59
4b    sigq=1, pq=1      Q:-1.22:M:.88:X:3.2:U:3.32:Q          c=.71, fq=.03, fm=.039, d=.59
4c    sigq=1, pq=10     Q:-.6:M:.94:U:1.02:X:2.56:U:2.76:Q    c=2.52, fq=.02, fm=.02, d=.48
4d    sigq=.5, pq=10    Q:-.48:M:.88:X                        c=1.75, fq=.0001, fm=.039, d=.59

Table Notes: As an example, Q:-1.06:M:.88:X indicates that Q is identified for x < -1.06, M is
identified for -1.06 < x < 0.88, and X is identified for x > +.88; pm is the prior probability for
M; c is cost; fq is false alarm rate due to Q; fm is false alarm rate due to M; and d is probability
of detection. All costs are from Table 3.

In  Table  4,  the  “priors”  for  X  and  M  are  taken  as  1  and  10,  respectively.  Although  true  priors  are 
correctly  defined  as  probabilities  which  sum  to  1.0  over  all  possibilities  for  a  single  event,  in  this 
example  we  multiply  the  true  priors  by  the  total  number  of  events  in  the  scenario.  Using  these  “gen¬ 
eralized”  priors,  the  resulting  cost,  c,  is  then  the  expected  cost  for  the  scenario  and  not  the  expected 
cost  for  a  single  event.  We  shall  see  that  this  approach  is  useful  when  we  examine  the  changes  of 
thresholds  and  costs  due  to  the  increase  in  the  number  of  events  of  interest  as  magnitude  decreases. 


As previously noted, we represent the population of mining explosions, M, by a normal
distribution with mean, mum=0, midway between the populations of Q and X with means at -1 and 1,
respectively. The standard errors for M and X are taken as 0.5, reflecting, generally speaking, good
discrimination.

The first row in Table 4, with cost functions graphed in Figure 4a, may be viewed as a case
wherein a particular mine or mining region is being monitored, and near which the activity
of nuclear explosions and earthquakes is low compared to that of the mining
events. The discrimination capability for the earthquakes is taken also to be rather good,
comparable to that of M and X; sigq=0.5.

We see that the cost is c=0.71, fq=.0001, fm=.039, and d=0.59, where fq is the false alarm rate
in which Q are misidentified as X, and fm is the false alarm rate in which M is misidentified as X.

By comparison, if we set pm=0, to revert to the simple Q and X case, but with the complex set
of costs in Table 3, we find a single decision line such that c=-0.52, fq=0.02, and d=0.98. Clearly,
introducing the mining population has severely degraded discrimination and raised the cost, given
that Q versus X discrimination capability remains the same, and that the same number of Q and X
events are analyzed.
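The three-population case of Figure 4a can be sketched in the same way as the two-population case. This is an illustrative reconstruction rather than the authors' routine: the cost entries are the Sum column of Table 3, the "generalized" priors are those of Table 4, and the function names and grid step are assumptions.

```python
import math

def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ecc_regions(cost, prior, dist, lo=-4.0, hi=4.0, step=0.01):
    """Assign each grid point to the action with the smallest point cost."""
    regions = []
    for i in range(int(round((hi - lo) / step)) + 1):
        x = lo + i * step
        point = {a: sum(prior[e] * norm_pdf(x, *dist[e]) * cost[a][e] for e in prior)
                 for a in cost}
        best = min(point, key=point.get)
        if regions and regions[-1][0] == best:
            regions[-1][2] = x
        else:
            regions.append([best, x, x])
    return regions

def scenario_cost(regions, cost, prior, dist):
    """Expected cost for the scenario (priors are event counts, not probabilities)."""
    mids = [0.5 * (regions[k][2] + regions[k + 1][1]) for k in range(len(regions) - 1)]
    edges = [-math.inf] + mids + [math.inf]
    total = 0.0
    for k, region in enumerate(regions):
        for e in prior:
            mass = norm_cdf(edges[k + 1], *dist[e]) - norm_cdf(edges[k], *dist[e])
            total += prior[e] * cost[region[0]][e] * mass
    return total, mids

# cost[action][event], taken from the Sum column of Table 3.
cost = {"Q": {"Q": 0.01, "M": 0.4, "X": 1.0},
        "M": {"Q": 0.4, "M": 0.01, "X": 1.0},
        "X": {"Q": 1.2, "M": 0.7, "X": -0.6},
        "U": {"Q": 0.3, "M": 0.4, "X": 0.4}}
prior = {"Q": 1.0, "M": 10.0, "X": 1.0}            # row 4a of Table 4
dist = {"Q": (-1.0, 0.5), "M": (0.0, 0.5), "X": (1.0, 0.5)}

regions = ecc_regions(cost, prior, dist)
labels = [r[0] for r in regions]
c, mids = scenario_cost(regions, cost, prior, dist)
print(":".join(labels), [round(m, 2) for m in mids], round(c, 2))
```

Under these assumptions the sketch should return the sequence Q:M:X with boundaries within a grid step or two of -1.06 and 0.88, and a scenario cost close to 0.71. Changing the entries of prior and dist to those of rows 4b-d gives the other scenarios.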

The second row in Table 4 (Figure 4b) shows that even if the standard error of Q is increased,
sigq=1.0 (that is, discrimination capability for Q is poor), there is little effect on the cost. This is
because the main effect of the increase in sigq is to make it more difficult to discriminate between
M and Q. Since there are many more M events, the decision point moves from x=-1.06 to -1.22 so
that there are not too many cqm costs inflicted. Most of the cost is due to overlap between the M
and X populations, and this is not changed by the increase in the standard error for Q. It is
difficult to identify explosions at a mine.

The  third  row  in  Table  4  (Figure  4c)  shows  that  if,  in  addition  to  increasing  the  variance  of  Q, 
we  increase  the  priors  so  that  there  are  equal  numbers  of  Q  and  M,  then  there  is  a  substantial  effect 
on  cost.  This  is  plausible  because  one  can  no  longer  appeal  to  the  greater  number  of  M,  as  in  the 
paragraph  above,  to  allow  shifting  the  decision  line  toward  Q.  So  one  must  absorb  the  many  cqm 
and  cmq  costs.  Thus,  a  mining  site  in  an  active  seismic  area  will  be  more  costly  to  monitor  than 
the  same  site  in  an  aseismic  region,  as  is  plausible,  given  the  specified  costs. 

Finally, if we attempt to improve the situation found in the previous paragraph by reducing the
standard error of Q from 1.0 to 0.5, thus improving discrimination for Q, the fourth row in Table 4
and Figure 4d show that indeed the cost, c, is noticeably reduced; but we still cannot recover to the
low costs of the low-seismicity regions, where pq=1, with sigq=0.5 or 1.0; the costs due to M
cannot be completely overcome.

Note that in all these scenarios the standard goodness parameters, f and d, remained close to
the same values. Thus, we see that these parameters do not tell the whole story.

Note, however, that if in the case of Figure 4d, we reduce the costs of mistaking Q for M or M
for Q (thus effectively to some degree lumping the two populations into one), the overall costs are
reduced, and the thresholds are stable down to values of cqm=cmq=0.01. Such a set of costs
would be suitable if one felt that there were a number of types of events and that it was not
important to discriminate between them; it is only a requirement to be sure that they are not X.
Such an approach has been discussed from the outlier point of view by Gray et al. (1996).

AN ANALOGY TO OUTLIER ANALYSIS: UNIFORM PRIORS FOR X:

In  many  areas  of  interest  it  may  be  that  there  are  no  sample  events  from  the  X  population.  In 
this  case,  we  may  want  to  make  a  less  definite  estimation  of  the  distribution  of  X.  Perhaps  we 
could  assume  that  the  prior  distribution  of  X  is  uniform  over  the  range  of  the  discrimination 
parameter. 

First  let  us  consider,  in  Figure  5a  and  in  row  1  of  Table  5,  a  set  of  parameters  which  we  have 
considered  previously,  (Row  2,  Table  1).  We  see  that,  as  before,  Q  is  identified  on  the  left,  U  in 
[-0.7, +0.7],  and  X  to  the  right. 


Table 5: Uniform Prior for X/Outlier Analysis

For each case: muq=-1.0, sigq=1.0, mux=1.0, px=0.5, pq=0.5, cxq=0.5, cqx=0.5, cuq=.1,
cux=.1, cxx=0, cqq=0

Fig   Parameters                                     Decision regions    Results
5a    sigx=1.0                                       Q:-0.7:U:+0.7:X     c=0.056, f=.044, d=.62
5b    X assumed uniform in [-5,5], sigx(real)=1.0    X:-3.34:U:1.36:X    c=0.077, cr=.085, f=.019, fr=.009, d=.53, dr=.36
5c    X assumed uniform in [0,5], sigx(real)=1.0     Q:0.04:U:1.04:X     c=.023, cr=.071, f=.021, fr=.021, d=.40, dr=.48
5d    X assumed uniform in [-5,5], sigx(real)=0.5    X:-3.34:U:1.36:X    c=0.077, cr=.092, f=.019, fr=.009, d=.53, dr=.23

Table Notes: As an example, Q:-0.7:U:0.7:X indicates that Q is identified for x < -0.7, the event
is U for -0.7 < x < 0.7, and X is identified for x > +0.7; pq is the prior probability for Q; c is
cost; cr is real cost, with uniform prior for X assumed, if real distribution is normal; f is false
alarm rate due to Q; d is probability of detection; fr and dr are the corresponding real values.

Let us suppose, in Figure 5b and in row 2 of Table 5, that the X population is replaced by a
uniform distribution in [-5, 5]. We then see that the U region is greatly enlarged, [-3.34, 1.36],
there is no Q region, and the X region exists both to the right and left. (The absence of a Q region,
as seen here, is not a general result. If cuq>0.1, instead of cuq=0.1 in this case, then there would
be a Q region symmetrical about x=-1. Then, as x further departed from -1.0, there would be
symmetrical U regions together with the remnants of the X regions seen in Row 2.)

Except  for  the  existence  of  the  U  regions,  this  result  seems  very  similar  to  those  resulting  from 
the  outlier  techniques  developed  by  Fisk  et  al.  (1996),  and  applied  by  Taylor  and  Hartse  (1997). 
The  present  approach  seems  to  add  some  flexibility  to  the  outlier  approach,  making  it  possible  to 
consider  costs  and  priors;  and  to  specify  the  region  of  discrimination  space  in  which  X  may  lie,  if 
such  information  exists,  instead  of  simply  saying  that  it  must  lie  outside  some  specified  popula¬ 
tions  with  some  probability. 

We  can  see,  in  Table  5,  the  kinds  of  qualitative  behavior  which  one  would  expect  when  com¬ 
paring  the  “outlier”  approach  to  an  approach  where  it  could  be  assumed  that  there  was  some 
detailed  knowledge  of  the  X  distribution. 

Comparing row 1 to row 2 (Figures 5a and 5b) in Table 5, we see that as a result of the
assumption of a uniform distribution for X, the overall cost has increased from c=0.056 to
c=0.077. This latter cost is calculated on the basis of the assumption that the true distribution of X
is uniform in [-5, 5]. If we assume that the true distribution for X is as seen in row 1, N(1,1), the
real cost, cr, for the threshold calculated assuming a uniform distribution, may be calculated
(using Matlab routine realcost.m) to be cr=0.085. This is greater than the optimum cost which
one would have with perfect information, and also more than the cost if a uniform distribution
were true.
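The distinction between the assumed cost c and the real cost cr can be made concrete with a short sketch. It is a Python stand-in written for illustration, not a port of the authors' Matlab routine: decision regions are chosen with X assumed uniform in [-5, 5], and the expected cost is then evaluated twice, once under the uniform assumption and once under the N(1,1) distribution. All function names are hypothetical.

```python
import math

def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def unif_pdf(x, a, b):
    return 1.0 / (b - a) if a <= x <= b else 0.0

def unif_cdf(x, a, b):
    return min(1.0, max(0.0, (x - a) / (b - a)))

PRIOR = {"Q": 0.5, "X": 0.5}                       # Table 5 header values
COST = {"X": {"X": 0.0, "Q": 0.5},
        "Q": {"X": 0.5, "Q": 0.0},
        "U": {"X": 0.1, "Q": 0.1}}

def regions_on_grid(pdf, lo=-5.0, hi=5.0, n=1000):
    """Assign each grid point in [lo, hi] to the action with the smallest point cost."""
    regions = []
    for i in range(n + 1):
        x = lo + (hi - lo) * i / n
        point = {a: sum(PRIOR[e] * pdf[e](x) * COST[a][e] for e in PRIOR) for a in COST}
        best = min(point, key=point.get)
        if regions and regions[-1][0] == best:
            regions[-1][2] = x
        else:
            regions.append([best, x, x])
    return regions

def expected_cost(regions, cdf):
    """Expected cost of fixed decision regions under the distributions in `cdf`."""
    mids = [0.5 * (regions[k][2] + regions[k + 1][1]) for k in range(len(regions) - 1)]
    edges = [-math.inf] + mids + [math.inf]
    return sum(PRIOR[e] * COST[r[0]][e] * (cdf[e](edges[k + 1]) - cdf[e](edges[k]))
               for k, r in enumerate(regions) for e in PRIOR)

assumed_pdf = {"Q": lambda x: norm_pdf(x, -1.0, 1.0), "X": lambda x: unif_pdf(x, -5.0, 5.0)}
assumed_cdf = {"Q": lambda x: norm_cdf(x, -1.0, 1.0), "X": lambda x: unif_cdf(x, -5.0, 5.0)}
real_cdf = {"Q": lambda x: norm_cdf(x, -1.0, 1.0), "X": lambda x: norm_cdf(x, 1.0, 1.0)}

regions = regions_on_grid(assumed_pdf)
labels = [r[0] for r in regions]
c = expected_cost(regions, assumed_cdf)    # cost if X really were uniform
cr = expected_cost(regions, real_cdf)      # "real" cost if X is N(1,1)
print(":".join(labels), round(c, 3), round(cr, 3))
```

Under these assumptions the regions should come out as X:U:X with boundaries near -3.35 and 1.35, and the two costs should be close to the c=0.077 and cr=0.085 of row 2 of Table 5.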

The outlier technique, because it makes no assumptions about the X population, controls only
the false alarm rate, f. We see that, for the minimum cost solutions, f has decreased from .044 to
.019 in transitioning from a known normal distribution to a uniform distribution, even though the
cost has increased. (Note that for a uniform distribution, false alarms occur on both tails of the Q
distribution.) But the probability of detection did decrease from d=0.62 to 0.53, and this would
lead to increased costs.

For the true distribution, fr, the "real" false alarm rate, is calculated on only one tail of the Q
distribution and fr=0.009, and dr is less also, with a value dr=0.36.

The most important point in contrasting rows 1 and 2 of Table 5 is that the true cost increase,
due to the assumption of a uniform distribution of X, is from 0.056 to 0.085. Of course, other sets
of conditional costs than those given in Table 5 would result in different cost increases. Another
way of looking at this result is that the outlier technique controls only the false alarm rate; it gives
no prediction as to the probability of detection.

Row  3  in  Table  5,  and  Figure  5c  show  how  the  uniform  distribution  could  be  restricted  only  to 
“reasonable”  regions  of  x.  In  this  case,  by  way  of  example,  we  assume  a  uniform  distribution  in 
[0, 5].  (Perhaps  the  region  should  extend  a  bit  into  negative  x,  considering  the  tails  of  the  true  X 
distribution,  but  we  will  pass  over  such  details  here.)  Note  that  the  resulting  decision  regions  are 
much  more  similar  to  the  “correct”  decision  lines  seen  in  5a  than  to  the  decision  lines  derived 
assuming  a  uniform  distribution,  as  seen  in  5b.  We  see  that  the  costs  decrease  as  compared  to  the 
previous  case  where  there  was  a  uniform  distribution  in  [-5, 5].  A  similar  procedure  might  be  use¬ 
ful  in  multivariate  spaces  if  “reasonable”  regions  could  be  determined  there  also. 

Row  4  in  Table  5  (Figure  5d,  corresponding  to  Row  4,  is  identical  to  Figure  5b)  shows  that  if 
the  real  distribution  for  X  is  even  narrower  (more  distinct  from  the  uniform  assumption  of  Row  2) 
then  cr  increases  from  0.085  to  0.092. 

In these analyses we have assumed that cxx=0. If cxx<0 so that there is a benefit to detecting
X, then Cx decreases, and the unidentified region in Figure 5d decreases so that there are more X
detected.

Consideration  of  these  cases  suggests  that  the  ECC  technique  has  the  capability  to  reasonably 
handle  cases  when  there  is  little  detailed  knowledge  of  the  distribution  for  X,  the  situation  for 
which  the  outlier  technique  was  designed. 

If we know more about the distribution of X, say that its population lies completely to the
right of x=0, then we should be able to improve the costs. Row 3 (Figure 5c) as compared to row
2 of Table 5, shows that cr does decrease in this case from 0.085 for X in [-5, 5] to 0.071 for X in
[0, 5].

Use of a uniform distribution could also be useful in multivariate discrimination where along
some dimensions the discriminant is well determined for both Q and X (e.g., Ms:mb and possibly
depth), whereas along other dimensions (e.g., P/S) it is well determined for Q but not X. So, a
uniform distribution would be assumed only for X along the P/S axis. Thus, the same basic
discrimination processes can be implemented in all scenarios of interest; the only difference would
be in the degree of specificity of the probability distributions.

MONITORING  REALISTIC  SEISMICITY: 

VARIABLE  THRESHOLDS  AS  A  FUNCTION  OF  mb 

As  we  have  seen  in  the  analyses  above,  thresholds  that  result  in  minimum  cost  will  vary  as  a 
function  of  prior  probability.  Since  the  number  of  expected  explosions  may  plausibly  be  regarded 
as  fixed  and  small,  while  the  number  of  earthquakes  and  mining  blasts  increases  as  magnitude 
decreases,  it  is  plausible  that  thresholds  for  minimum  cost  monitoring  will  vary  as  a  function  of 
magnitude. 

Other  reasons  for  imagining  that  the  thresholds  would  vary  as  a  function  of  magnitude  are  that 
the  efficiency  of  discrimination  is  likely  to  decline  as  a  function  of  magnitude  (the  standard  error 
is  likely  to  become  larger  due  to  poorer  S/N  and  fewer  detecting  stations);  and  the  benefits  and 
costs  of  correct  and  incorrect  identification  may  decline  in  absolute  value  as  magnitude  decreases. 

For  the  many  small/unimportant  events,  we  cannot  afford  a  large  expense  per  event  to  further 
process  many  unidentified  events  as  compared  to  the  costs  for  false  alarms,  i.e.,  c(U|Q)  may  be 
very  small  compared  to  c(Q|X)  and  c(X|Q)  for  small  magnitudes.  In  this  case,  to  minimize  total 
cost  for  such  events,  the  unidentified  region  expands,  the  prior  on  X  tends  to  dominate,  and  the 
proper  strategy  is  to  identify  only  those  events  for  which  the  discriminant  is  unequivocal. 

On the other hand, for the few large/important events, we can afford a greater expense to work
on the unidentified events, i.e., c(U|Q) may be larger compared to c(Q|X) and c(X|Q). In this case,
the minimum total cost (including the costs of work on missed detections and false alarms) is
found by using a smaller unidentified region.

It is worth noting that the absolute costs for large/important events are likely larger than for
the small/unimportant events. This fact will come into play when one allocates a fixed budget
between small and large event analysis, but it is only the ratio of costs which comes into play in
determining the undecided zone for a fixed magnitude or other level of importance.


To begin to discuss this issue, Table 6 gives some illustrative costs which might be thought of
as realistic in some situations.

Table 6: Decision Costs as a Function of Magnitude (px=1)

             3<mb<4, pq=1000              4<mb<5, pq=100               5<mb, pq=10
             Monitoring     Political     Monitoring     Political     Monitoring     Political
Cost Type    Short/Long     Short/Long    Short/Long     Short/Long    Short/Long     Short/Long
cqx          0.00/0.25      0.00/0.5      0.0/0.5        0.0/0.5       0.0/0.5        0.0/0.5
cxq          0.05/0.125     0.05/0.125    0.1/0.25       0.1/0.25      0.1/0.5        0.1/0.5
cux          0.1/0.1        0.0/0.1       0.1/0.1        0.0/0.1       0.1/0.1        0.0/0.2
cuq          0.05/0.0       0.0/0.0       0.05/0.05      0.0/0.05      0.1/0.1        0.0/0.1
cxx          0.1/0.0        -0.2/-0.5     0.1/0.0        -0.2/-0.5     0.1/0.0        -0.2/-0.5
cqq          (0.01/0.00)    0.0/0.0       (0.01/0.00)    0.0/0.0       (0.01/0.00)    0.0/0.0

Among  the  points  of  interest  in  Table  6: 

•  The  prior  probability  for  Q  increases  from  10  to  1000  as  mb  decreases. 

•  The  costs  of  misidentification  decline  as  mb  decreases. 

•  The  benefit  of  correct  identification  of  X  remains  constant  as  mb  decreases. 

•  The  cost  of  misidentifying  X  as  Q  is  4-7  times  greater  than  X  as  U. 

•  cqq  is  0  or  0.01;  if  not  negligible,  then  small  compared  to  all  other  costs,  so: 

•  Identifying  Q  as  U  is  5-10  times  more  costly  than  Q  as  Q;  (If  cqq  not  equal  to  0.0). 

The results of calculations using the parameters of Table 6 are given in Table 7:

Table 7: System Thresholds and Costs as a Function of Magnitude

For each case: muq=-1.0, mux=+1.0, sigq=0.5, sigx=0.5, costs: Table 6

Magnitude range, prior    Decision regions    Results
3<mb<4, pq=1000           Q:0.6:U:0.74:X      c=-0.123, f=0.00025, d=0.70, cS=9.86
4<mb<5, pq=100            Q:0.4:U:0.5:X       c=-0.28, f=.0013, d=0.84, cS=0.72
5<mb, pq=10               Q:0.22:U:0.28:X     c=-0.42, f=.005, d=.925, cS=-0.32

Table Notes: As an example, Q:0.6:U:.74:X indicates that Q is identified for x < 0.6, U is
identified for 0.6 < x < 0.74, and X is identified for x > 0.74. pq is the prior probability for Q, c
is cost if cqq=0, and cS is the cost if cqq=0.01. (Changing cqq from 0.00 to 0.01 changes thresholds
only by approximately 0.01. These threshold changes can be neglected for all purposes in this
memorandum.) f is the false alarm rate due to a single member of Q. d is probability of
detection of a single member of X.

We note that as the Q prior increases from 10 to 1000, the U region dividing Q and X moves to
larger values of x, toward X, making it more difficult to identify X. Thus, the false alarm rate, f,
and probability of detection decrease as magnitude decreases. Note, however, that the
prior-probability weighted false alarm rate, pq*f, increases from 0.05 to 0.25. This means that in the
course of 1000 low-magnitude Q events (a year of events), the expected number of Q that would
be identified as X is 0.25.
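The movement of the thresholds with the Q prior can be sketched in closed form. The example assumes equal standard errors for Q and X, so that equating adjacent point costs gives a likelihood ratio that is solved directly for x. The cost values are the row sums of Table 6 with cqq=0; the function names and the dictionary layout are illustrative.

```python
import math

def ratio_to_x(ratio, mu_q=-1.0, mu_x=1.0, sigma=0.5):
    """x at which f_X(x)/f_Q(x) equals `ratio` for two normals with a common sigma."""
    return 0.5 * (mu_q + mu_x) + sigma ** 2 / (mu_x - mu_q) * math.log(ratio)

def qux_thresholds(pq, px, c):
    """Q/U and U/X boundaries obtained by equating adjacent point costs.

    Meaningful as written only when the first value is below the second,
    i.e. when a U region exists; otherwise Q and X meet directly.
    """
    q_u = ratio_to_x(pq * (c["uq"] - c["qq"]) / (px * (c["qx"] - c["ux"])))
    u_x = ratio_to_x(pq * (c["xq"] - c["uq"]) / (px * (c["ux"] - c["xx"])))
    return q_u, u_x

# (pq, summed costs) per magnitude band; sums of the Table 6 entries with cqq=0.
BANDS = {
    "3<mb<4": (1000, {"qx": 0.75, "xq": 0.35, "ux": 0.3, "uq": 0.05, "xx": -0.6, "qq": 0.0}),
    "4<mb<5": (100, {"qx": 1.0, "xq": 0.7, "ux": 0.3, "uq": 0.15, "xx": -0.6, "qq": 0.0}),
    "5<mb": (10, {"qx": 1.0, "xq": 1.2, "ux": 0.4, "uq": 0.3, "xx": -0.6, "qq": 0.0}),
}

thresholds = {band: qux_thresholds(pq, 1, c) for band, (pq, c) in BANDS.items()}
for band, (q_u, u_x) in thresholds.items():
    print(band, round(q_u, 2), round(u_x, 2))
```

The boundaries obtained this way should fall within a few hundredths of the Table 7 entries, and they shift toward X as pq grows because the prior enters each boundary only through the logarithm of the cost-weighted prior ratio.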

We  note  that  if  cqq=0,  the  cost  of  operation  decreases  as  the  magnitude  decreases,  probably 
due  to  the  fact  that  the  costs  of  misidentification  decrease,  while  cxx  benefits  are  held  constant. 
However, if we set cqq=0.01 instead of 0.00, then the system cost increases as magnitude decreases. For example, for the lowest magnitude range, the cost is 9.86 instead of -0.123. For values of cqq between 0.00 and 0.01, the system cost would be between -0.123 and 9.86.

For cqq=0.01, the total cost, summing over all magnitudes, is 10.26. (Note that this is almost identical to 10.28, which is the sum of 11.1, the cost for 1110 Q with a cqq of 0.01, and -0.823, the total cost if cqq=0. Thus, almost the entire net effect of setting cqq=0.01 is simply to add cqq for all Q.)
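
The arithmetic behind this comparison can be reproduced in a few lines. This is only a bookkeeping sketch using the cS and pq values of Table 7 and the totals quoted in the text; the variable names are illustrative.

```python
# Costs for the three magnitude ranges with cqq = 0.01 (cS in Table 7)
cs_by_range = {"3<mb<4": 9.86, "4<mb<5": 0.72, "5<mb": -0.32}
# Q priors (expected numbers of Q) for the same ranges
pq_by_range = {"3<mb<4": 1000, "4<mb<5": 100, "5<mb": 10}

cqq = 0.01
total_cs = sum(cs_by_range.values())   # total cost with cqq = 0.01
n_q = sum(pq_by_range.values())        # total number of Q over all ranges
total_c0 = -0.823                      # total cost with cqq = 0, as quoted in the text
approx_total = cqq * n_q + total_c0    # add cqq once for every Q

print(n_q, round(total_cs, 2), round(approx_total, 2))
```

The two totals differ only in the second decimal place, which is the sense in which setting cqq=0.01 mostly just adds cqq for every Q.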

Thresholds were also determined using a uniform distribution for X; the major effect was to raise the thresholds in Table 6 by about 0.2.

Using these uniform-distribution thresholds, costs were then estimated using the “true” distribution for X, and the cost was found to increase from 10.26 to 10.66. This small, 4%, increase in cost reflects the dominance of cqq costs over all others. If cqq costs were not so dominant, then the costs of misclassification would lead to a greater percentage difference in cost. For example, if cqq=0, then the true cost would increase from approximately -0.82 to -0.44, roughly a 50% increase in cost. So the benefit of knowing the “true” distribution of X varies between 4% and 50% of total system costs under these assumptions.
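
As a worked check of the two percentages, assuming each is taken relative to the magnitude of the cost obtained with the “true”-distribution thresholds:

```latex
\frac{10.66 - 10.26}{10.26} \approx 0.039 \quad (\text{about } 4\%),
\qquad
\frac{-0.44 - (-0.82)}{\lvert -0.82 \rvert} = \frac{0.38}{0.82} \approx 0.46 \quad (\text{roughly } 50\%)
```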

It is also worth noting that costs for the lowest magnitudes would greatly increase if the discrimination capability were substantially worse for the lowest magnitudes, as may often be the case in practice. For example, if sigx=0.707 for the lowest magnitude, and cqq=0.0, then the ECC cost, c, increases from -0.123 to 0.464, and the probability of detection, d, decreases from 0.7 to 0.26.

SUMMARY  AND  SUGGESTIONS  FOR  FURTHER  WORK 

This memorandum has outlined a general procedure for discrimination which has most of the properties that experience has shown to be desirable in practice. In addition, the decision thresholds emerge naturally from cost estimates which system managers can be expected to be prepared to make in practice.

The procedure may be applied sequentially to several discriminants, and some sort of voting scheme applied to the output; this is similar to some present procedures. Alternatively, the procedure may be easily generalized to multiple dimensions.

In the former case, cost considerations might not easily be carried through to the final seismic decision; in the latter, multiple-dimension case, they easily could be. Thus, a multiple-dimension approach would appear to be more desirable.

However, a plausible approach to a sequential procedure would be to assume that a single Q or X identifies the event. Only if all discriminants decided U would the event be U. Then, plausibly, all costs and benefits should be the same as in the multidimensional case, except that all costs for identification of an event as U would be divided by the number of discriminants. In this way, if all discriminants resulted in U, then the calculated cost would be correct. It would be useful to make this argument rigorous.
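
The sequential rule just described can be sketched as follows. This is a hypothetical illustration, not an existing routine: it assumes the discriminants are consulted in order and that the first Q or X decision settles the event (the text does not say how a Q from one discriminant and an X from another would be reconciled), and the U cost of 0.3 is an invented value.

```python
def combine_sequential(decisions):
    """Combine per-discriminant decisions ('Q', 'X', or 'U') into one event decision.

    Assumption: the first Q or X encountered identifies the event;
    the event is U only if every discriminant decided U.
    """
    for decision in decisions:
        if decision in ("Q", "X"):
            return decision
    return "U"

def per_discriminant_u_cost(event_u_cost, n_discriminants):
    """Divide the event-level cost of U equally among the discriminants."""
    return event_u_cost / n_discriminants

print(combine_sequential(["U", "Q", "U"]))
print(combine_sequential(["U", "U", "U"]))

# If all discriminants decide U, the shares add back up to the event-level U cost.
n = 3
shares = [per_discriminant_u_cost(0.3, n) for _ in range(n)]
print(sum(shares))
```

Dividing the U cost by the number of discriminants is what makes the all-U case come out with the same total cost as a single multidimensional U decision.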

The next project should be to apply this approach to actual data from an operational system. Most useful, perhaps, would be Ms:mb and depth data, properly reflecting the detectability of seismicity, and also assuming a uniform distribution in magnitude for a small probability of explosions, also appropriately weighted by detectability. Other possible discriminants would be phase ratio and spectral ratio data from a monitoring site of interest.

Another  project  would  be  to  consult  with  system  managers  and  determine  a  definitive  set  of 
costs  and  benefits  for  use  in  actual  applications. 

CALCULATION  ROUTINES 

The actual calculations in this memorandum were performed by several Matlab routines written by the author. The routine plfig.m calculated the cost distributions for Q, U, and X using equation (1), detected which was a minimum for each x and thus calculated the identification statistic (ID), printed out the transition values of x (which amounted to detection thresholds), and plotted the cost distributions and the ID statistic. Standard routines within plfig.m were used to calculate c, f, and d, given the thresholds.
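
The steps attributed to plfig.m can be illustrated with a short Python sketch. It is not a translation of the author's routine: the costs and priors below are hypothetical (they are not the Table 5 values), plotting is omitted, and all names are illustrative.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical inputs: equal priors and symmetric costs.
PRIORS = {"Q": 0.5, "X": 0.5}
DENSITIES = {"Q": lambda x: normal_pdf(x, -1.0, 0.5),
             "X": lambda x: normal_pdf(x, 1.0, 0.5)}
# COSTS[action][true type] = c(action | true type)
COSTS = {"Q": {"Q": 0.0, "X": 1.0},
         "U": {"Q": 0.3, "X": 0.3},
         "X": {"Q": 1.0, "X": 0.0}}

def point_cost(action, x):
    """Equation (1): sum over true types i of c(action|i) * p_i * f_i(x)."""
    return sum(COSTS[action][i] * PRIORS[i] * DENSITIES[i](x) for i in PRIORS)

def identify(x):
    """Return the action with the lowest point cost at x."""
    return min(COSTS, key=lambda action: point_cost(action, x))

def transitions(x_min=-3.0, x_max=3.0, step=0.001):
    """Scan x and report (x, old ID, new ID) wherever the ID changes."""
    found = []
    n_steps = int(round((x_max - x_min) / step))
    previous = identify(x_min)
    for n in range(1, n_steps + 1):
        x = x_min + n * step
        current = identify(x)
        if current != previous:
            found.append((round(x, 3), previous, current))
            previous = current
    return found

print(transitions())
```

For these hypothetical inputs the transitions can also be found analytically at x = ±ln(7/3)/8, about ±0.106, so the scan should report a Q-to-U change near -0.106 and a U-to-X change near +0.106.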

The routine plfig4.m performed the same calculations as plfig.m except that the M population was added. Both routines had an option to assume a uniform prior distribution for X in [-5, 5].

The  routine  plfig_exp.m  performed  the  same  calculations  as  plfig.m  except  that  it  modeled  the 
X  prior  distribution  as  uniform  between  input  parameters  xleft  and  xright. 

The routine realcost.m is used to calculate the cost for a set of thresholds and point cost distributions. The application is to use thresholds calculated assuming a uniform prior distribution for X, while evaluating the cost under some other, true, normal distribution for X.
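
A calculation of this kind can be sketched in Python as follows. This is not the author's routine: it assumes a single Q:U:X threshold pair, normal true distributions, and hypothetical costs and priors (not those of Table 5); all names are illustrative.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution of a normal with mean mu and standard deviation sigma."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))

# Hypothetical inputs: equal priors, symmetric costs, true normal populations.
PRIORS = {"Q": 0.5, "X": 0.5}
TRUE_PARAMS = {"Q": (-1.0, 0.5), "X": (1.0, 0.5)}   # (mean, standard deviation)
COSTS = {"Q": {"Q": 0.0, "X": 1.0},                  # COSTS[action][true type]
         "U": {"Q": 0.3, "X": 0.3},
         "X": {"Q": 1.0, "X": 0.0}}

def expected_cost(t_qu, t_ux):
    """Expected cost when Q is decided for x < t_qu, U for t_qu <= x < t_ux, X otherwise."""
    total = 0.0
    for i, (mu, sigma) in TRUE_PARAMS.items():
        prob_q = normal_cdf(t_qu, mu, sigma)          # true type i falls in the Q region
        prob_x = 1.0 - normal_cdf(t_ux, mu, sigma)    # true type i falls in the X region
        prob_u = 1.0 - prob_q - prob_x                # remainder falls in the U region
        total += PRIORS[i] * (COSTS["Q"][i] * prob_q
                              + COSTS["U"][i] * prob_u
                              + COSTS["X"][i] * prob_x)
    return total

matched = expected_cost(-0.106, 0.106)   # thresholds matched to the true distributions
shifted = expected_cost(0.094, 0.306)    # the same thresholds raised by 0.2
print(matched, shifted)
```

The matched pair is the minimum-point-cost solution for these inputs, so any other pair, such as the one shifted by 0.2 to mimic thresholds derived from a uniform X assumption, gives a higher expected cost under the true distributions.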

[Figure 1 plot not reproduced; vertical axis: Probability]

Figure 1. Typical hypothetical probability distributions for populations of earthquakes (Q), explosions (X), and mining events (M) studied in this memorandum. The means of the distributions are centered at -1, 0, or 1. The broader distributions have standard deviations of 1.0, the narrower, 0.5. The general procedure in this memorandum does not depend on normality; normal populations are used only for convenience.

[Figure 2 plots not reproduced; panels 2a, 2b; axis label: COSTS, ID]

Figure 2. Plots of point costs (Equation 1) and the ID parameter as a function of the discrimination variable, x. The lowest cost at each x determines the ID. For example, in Figure 2a, at x=0, Cu, as indicated by the dotted line, has the minimum point cost, so x=0 is in the U region. As the ID parameter varies between relative levels of 1:2:3, the identification varies Q:U:X. Figure 2a shows a case where an undecided region is derived for data suitable for treatment by classical discrimination, due to a lower cost for “no decision” than for an incorrect decision. Figure 2b shows how lowest cost results in more X being identified due to benefits (negative costs) for correct identification of X. Figure 2c shows how more Q are identified as a result of a high prior for Q. Figure 2d shows the counterbalancing effect of benefits for identifying X and a high prior for Q. Detailed parameters are in Table 1.

[Figure 3 plots not reproduced; panels 3a, 3b]

Figure 3. Results for unequal variance (sigq=1.0, sigx=0.5). Detailed parameters are in Table 1, column 3. Plots of point costs (Equation 1) and the ID parameter as a function of the discrimination variable, x. The lowest point cost at each x determines the ID. For example, in Figure 3a, at x=-1, Cq, as indicated by the dashed line, has the minimum point cost, so x=-1 is in the Q region. As the ID parameter varies between relative levels of 1:2:3, the identification varies Q:U:X. Other than sigx, parameters are the same as for Figure 2. Note that due to sigq > sigx, Q is the correct ID for both small and large x. However, only a small total probability is associated with the large positive values of x. Note, also, the total absence of an X region for Figure 3c; Cx is nowhere the minimum; the ID level near x=1.0 is U.

Figure 4. Results for discrimination of Q, M, and X. Detailed parameters and numerical results are in Tables 3 and 4. Plots of point costs (Equation 1) and the ID parameter as a function of the discrimination variable, x. The lowest point cost at each x determines the ID. From the lowest to the highest level of the ID parameter we have Q:U:M:X. For example, in Figure 4a, the lowest point cost for x=0 is CM, so M is identified for x=0 where the ID parameter is at the third level. The principal result of this analysis is that the presence of M and Q complicates the discrimination of either from X. See the text for detailed discussion.

Figure 5. Results for discrimination of Q and X with a uniform prior distribution for X; an analysis similar to outlier analysis. Detailed parameters and numerical results are given in Table 5. Plots of point costs (Equation 1) and the ID parameter as a function of the discrimination variable, x. The lowest point cost at each x determines the ID. From the lowest to the highest level of the ID parameter we have Q:U:X. For example, in Figure 5a, the lowest point cost for x=0 is Cu, so U is identified for x=0 where the ID parameter is at the second level. The principal result of this analysis is that true costs, if a uniform distribution is assumed for X, are higher than if one assumes the true distribution for X. See the text for detailed discussion.

OUTLINE OF EXPECTED COST OF CLASSIFICATION (ECC) THEORY

The action, k, which is performed in response to a measured discrimination parameter vector, x, is determined by which action, k, has the lowest point cost, C_k, at x:

    C_k(x) = \sum_i c(k|i) \, p_i \, f_i(x)

This results in a minimum expected cost equal to the minimum C_k integrated over x.

In  the  formula,  i  represents  true  event  type,  e.g.: 

explosion  X 

earthquake  Q 

mine  blast  M 

and  k  represents  a  possible  action,  e.g.: 

decide  X 

decide  Q 

decide  M 

decide  U  (unidentified,  no  action) 

decide  F  (unidentified,  fly  satellite) 

and 

c(k|i) are the costs for action k, given event type i, e.g., c(U|X)
f_i(x) is the probability distribution of x for event type i
p_i is the prior probability of event type i

To simplify to reach the classical linear discriminant one must assume:

(1) k has the same range, usually (1,2), as i (e.g., no U action)

(2) c(i|i)=0 (e.g., no benefits, no different benefits for Q and X)

(3) f_i are normal and have equal variance

Usually it is assumed that the p_i are equal (but there are usually many more Q than X).
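
As a worked illustration of this simplification, consider the one-dimensional case with two event types, Q and X, and two actions, decide Q and decide X. The symbols \mu_Q, \mu_X, and \sigma for the population means and common standard deviation are introduced here for illustration, and \mu_X > \mu_Q is assumed. Under assumptions (1) to (3), comparing the two point costs gives a single threshold that is linear in x:

```latex
\begin{align*}
C_X(x) < C_Q(x)
  &\iff c(X|Q)\,p_Q\,f_Q(x) < c(Q|X)\,p_X\,f_X(x) \\
  &\iff \ln\frac{f_X(x)}{f_Q(x)} > \ln\frac{c(X|Q)\,p_Q}{c(Q|X)\,p_X} \\
  &\iff \frac{\mu_X-\mu_Q}{\sigma^2}\left(x - \frac{\mu_Q+\mu_X}{2}\right) > \ln\frac{c(X|Q)\,p_Q}{c(Q|X)\,p_X} \\
  &\iff x > \frac{\mu_Q+\mu_X}{2} + \frac{\sigma^2}{\mu_X-\mu_Q}\,\ln\frac{c(X|Q)\,p_Q}{c(Q|X)\,p_X}
\end{align*}
```

With equal misclassification costs and equal priors the logarithm vanishes and the threshold is the midpoint of the two means; a larger prior for Q moves the threshold toward X.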

SUMMARY


The ECC method is a classical statistical procedure; calculations are simple and direct. The method minimizes cost instead of maximizing statistical power, and has many features characteristic of present procedures.

The  method  can  consider: 

•  Multiple  actions,  including  undecided 

•  Misclassification  costs 

•  Correct-classification  benefits 

•  System  and  political  costs 

•  Multiple  populations 

•  Unequal  variances/empirical  distributions 

•  Multiple  dimensions 

•  Unequal  priors:  thresholds  may  vary  with  magnitude 

•  Uniform and truncated distributions if inadequate knowledge for, e.g., X

Future work:

•  Apply  to  actual  data:  e.g.,  Ms:mb  and  depth  in  2D,  including  seismicity 

•  Develop  realistic  costs  for  various  scenarios  via  interviews;  apply 

•  Develop  methods  of  applying  sequentially,  compare  to  multiple  dimensions 

•  Develop  software  system  to  derive  distributions  from  data  and  enter  into  ECC 